Add clustered migration sync for shared disks (SYNCING barrier) #403

Open
fabi200123 wants to merge 1 commit into cloudbase:master from fabi200123:cor-707

fabi200123 commented Mar 25, 2026

Contributor

This PR adds core support for clustered transfers of shared disks:

  • Adds a new SYNCING task status and a TASK_TYPES_TO_SYNC list (GET_INSTANCE_INFO, SHUTDOWN_INSTANCE). Tasks of these types wait for their peers across all instances of a clustered transfer before completing.
  • Adds a clustered flag on BaseTransferAction, set automatically when a transfer has more than one instance.
  • After all GET_INSTANCE_INFO tasks sync, the conductor assigns an owner (a new property in the VM export info schema) to every disk in each instance's export_info. The first instance to report a disk ID owns it.
  • REPLICATE_DISKS now passes the full volumes_info to the source provider (skipping shared disks is handled provider-side).
  • If a synced task fails, peer tasks waiting in SYNCING are failed too (cancellation handles SYNCING tasks the same way as pending ones).
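For illustration only, the barrier check and the "first reporter owns the disk" rule described above could look roughly like the sketch below. This is not the code from the PR: the function names, the COMPLETED status name, and the `devices` / `disks` layout of export_info are assumptions made for the example. Only SYNCING and the `owner` property come from the description.

```python
SYNCING = "SYNCING"      # status name taken from the PR description
COMPLETED = "COMPLETED"  # assumed name for a peer that has already finished


def barrier_reached(peer_statuses):
    """Return True once every peer task is waiting in SYNCING or is done."""
    return all(status in (SYNCING, COMPLETED) for status in peer_statuses)


def assign_disk_owners(export_infos):
    """Tag every disk with an owner; the first instance reporting an id wins.

    export_infos maps an instance name to its export_info dict. The
    info["devices"]["disks"] layout is an assumption for this sketch.
    """
    owners = {}
    # First pass: remember which instance reported each disk id first.
    for instance, info in export_infos.items():
        for disk in info["devices"]["disks"]:
            owners.setdefault(disk["id"], instance)
    # Second pass: write the owner back onto every disk entry.
    for info in export_infos.values():
        for disk in info["devices"]["disks"]:
            disk["owner"] = owners[disk["id"]]
    return owners
```

With two instances that both report the same disk id, the instance iterated first becomes the owner in both export_info entries, so the result depends on a stable iteration order over instances.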

Comment thread coriolis/tasks/replica_tasks.py Outdated
Comment thread coriolis/conductor/rpc/client.py Outdated
Comment thread coriolis/conductor/rpc/server.py Outdated
Comment thread coriolis/conductor/rpc/server.py Outdated
Comment thread coriolis/conductor/rpc/server.py Outdated
Comment thread coriolis/conductor/rpc/server.py Outdated
Comment thread coriolis/conductor/rpc/server.py Outdated
fabi200123 force-pushed the cor-707 branch 4 times, most recently from 9813d2b to 61ebc94, on April 9, 2026 01:21
Comment thread coriolis/conductor/rpc/server.py
Comment thread coriolis/utils.py Outdated
Comment thread coriolis/conductor/rpc/server.py Outdated
Comment thread coriolis/conductor/rpc/server.py Outdated
Comment thread coriolis/schemas/disk_sync_resources_info_schema.json Outdated
Comment thread coriolis/tasks/replica_tasks.py Outdated
fabi200123 force-pushed the cor-707 branch 3 times, most recently from 173c46b to 6433f4e, on May 13, 2026 13:12
Comment thread coriolis/conductor/rpc/server.py Outdated
Comment thread coriolis/conductor/rpc/server.py Outdated
Comment thread coriolis/conductor/rpc/server.py Outdated
Comment thread coriolis/conductor/rpc/server.py Outdated
Comment thread coriolis/conductor/rpc/server.py Outdated
Comment thread coriolis/conductor/rpc/server.py Outdated
Comment thread coriolis/conductor/rpc/server.py Outdated
Comment thread coriolis/tasks/replica_tasks.py Outdated
Comment thread coriolis/tasks/replica_tasks.py Outdated
Comment thread coriolis/utils.py Outdated
fabi200123 force-pushed the cor-707 branch 2 times, most recently from 6e211ad to f1476b1, on June 26, 2026 07:46
Comment thread coriolis/conductor/rpc/server.py Outdated
for disk_id in shared_disk_ids:
    ident = utils.cluster_disk_identity(disk_id)
    owner_id = owners.get(ident)
    if owner_id == instance_id or not owner_id:

Contributor

Shouldn't it be guaranteed that we get an owner_id? I feel like this should error out if there's no owner returned.
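One way to read this suggestion, as a rough sketch rather than the project's actual code: treat a missing owner as an error instead of silently counting the disk as owned by the current instance. The helper name `is_owned_by` and the exception class `MissingDiskOwnerError` are hypothetical and exist only for this example.

```python
class MissingDiskOwnerError(Exception):
    """Hypothetical error for a shared disk that has no owner assigned."""


def is_owned_by(owners, ident, instance_id):
    """Return True if instance_id owns the disk identified by ident.

    Unlike `owner_id == instance_id or not owner_id`, a missing owner is
    treated as a broken invariant and raises instead of returning True.
    """
    owner_id = owners.get(ident)
    if owner_id is None:
        raise MissingDiskOwnerError(
            "no owner assigned for shared disk %r" % (ident,))
    return owner_id == instance_id
```

Whether raising is appropriate depends on whether owner assignment is guaranteed to have run for every shared disk before this point, which is the question the comment raises.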

Comment thread coriolis/conductor/rpc/server.py Outdated
        target_vol = vol
        break
if target_vol is None:
    norm_wid = utils.cluster_disk_identity(

Contributor

Why not use ident instead of assigning a new norm_wid? It's the same value.

Comment thread coriolis/conductor/rpc/server.py Outdated
if target_vol is None:
    norm_wid = utils.cluster_disk_identity(
        disk_id)
    for vol in volumes_info:

Contributor

This for loop is pointless. The same thing is already checked above (since ident == norm_wid).
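A possible shape for the simplification, sketched under assumptions: if the identity is computed once as ident, a single pass over volumes_info is enough and no fallback loop with a second normalized value is needed. The helper name `find_volume`, the `disk_id` key on each volume, and the `normalize` parameter (standing in for utils.cluster_disk_identity) are assumptions here, since the surrounding code is not shown in the thread.

```python
def find_volume(volumes_info, ident, normalize):
    """Return the first volume whose normalized disk id equals ident.

    normalize stands in for utils.cluster_disk_identity; the "disk_id"
    key on each volume dict is an assumption for this sketch.
    """
    for vol in volumes_info:
        if normalize(vol["disk_id"]) == ident:
            return vol
    return None
```

For example, with `normalize=str.lower` and `ident="disk-a"`, a volume whose `disk_id` is `"DISK-A"` is matched in the single pass, and `None` is returned when nothing matches.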


3 participants